
Molecular Biology and Evolution

Oxford University Press (OUP)

Preprints posted in the last 90 days, ranked by how well they match Molecular Biology and Evolution's content profile, based on 488 papers previously published here. The average preprint has a 0.21% match score for this journal, so anything above that is already an above-average fit.

1
An Empirical Bayes approach for the study of phenotypic evolution from high-dimensional data

Montoya, P.; Fabre, A.-C.; Goswami, A.; Morlon, H.; Clavel, J.

2026-03-16 evolutionary biology 10.64898/2026.03.13.711658 medRxiv
Top 0.1%
43.7%

Multivariate phylogenetic comparative methods for modelling high-dimensional traits such as 3D shapes or gene expression profiles have been recently developed. However, these approaches become computationally prohibitive, and therefore impractical, when the number of traits exceeds a few thousand. We overcome these limitations by proposing a new maximum likelihood approach based on the Empirical Bayes framework. This approach takes into account the information of the complete covariances (among species and traits) to infer parameters and compare models of trait evolution for high-dimensional datasets. Through simulations, we demonstrate that the proposed approach can accurately estimate parameters of various trait evolution models, even when the number of traits is ten times larger than the number of lineages; it requires less memory and is at least 10 times faster than currently available approaches. This fast, efficient framework enabled us to extend the high-dimensional multivariate phylogenetic comparative toolkit by including an Ornstein-Uhlenbeck process with multiple optima to study adaptation to various selective regimes. Applying our approach to the evolution of jaw morphology in relation to dietary adaptation in mammals, we demonstrate morphological convergence in carnivorous and herbivorous lineages. The proposed Empirical Bayes framework, implemented in the R package mvMORPH, enables phylogenetic comparative methods to efficiently handle high-dimensional datasets and complex models of trait evolution.

2
Modeling Site-Specific Mutation Patterns in Pandemic-Scale Phylogenetics

Martin, S.; Ly-Trong, N.; Minh, B. Q.; Goldman, N.; De Maio, N.

2026-05-04 evolutionary biology 10.64898/2026.04.30.721865 medRxiv
Top 0.1%
42.1%

Models of genome evolution often account for different evolutionary rates at different genome positions due to, e.g., varying selective pressures or mutation rates. Recent evidence from millions of publicly shared SARS-CoV-2 genomes has revealed a more complex mutational landscape than can be modeled with existing approaches. Mutation rates are in fact not only highly position-specific, as currently modeled, but also nucleotide-specific; for example, specific mutations can occur very often at certain genome positions, while other mutations at the same positions might not be highly recurrent. Here, we propose and investigate a general model of genome evolution where each genome position is allowed to evolve under an independent, non-normalized substitution rate matrix describing site-specific rates of all mutation types ("Site-Specific Matrix" model, or SSM). We implement SSM in the efficient pandemic-scale phylogenetic inference software CMAPLE. Large-scale genomic epidemiological simulations suggest that, given enough data, SSM can accurately infer position- and nucleotide-specific substitution rates for more frequently observed nucleotides (typically the reference nucleotide), while other rates require higher levels of divergence. Simulations also show that SSM has a modest impact on the accuracy of phylogenetic tree estimation. We use SSM to analyze the evolution of millions of SARS-CoV-2 genomes and observe substantial mismatches between the substitution rates of classical rate variation models and our SSM estimates. These results suggest that classical models of rate variation are inadequate for modeling site-specific mutation patterns and that SSM is a useful alternative for large-scale genome analyses.

3
Evolutionary rate correlations reveal long-term co-evolutionary interactions in Drosophila melanogaster

Dagilis, A. J.; DiAngelis, B.; Lee, S.; Matute, D. R.

2026-05-23 evolutionary biology 10.64898/2026.05.21.726714 medRxiv
Top 0.1%
32.3%

Co-evolution between genes can occur for a variety of reasons, including co-expression of genes, epistatic interactions between them, physical interactions of gene products and many others. Co-evolutionary partners of a gene are therefore of great interest in identifying potential factors that contribute to any phenotype of interest. State-of-the-art approaches to detect these interactions use correlations of evolutionary rates across a broader phylogeny, and so by necessity identify interactions only among genes that are present across long evolutionary time periods. This makes the methods unwieldy when interest lies in a single focal organism in which the genes of interest may have evolved in the recent evolutionary past. Here, we present a new approach to calculating evolutionary rate correlations which focuses on extracting maximum coverage for a single focal species, while retaining signals of co-evolution across large clades. We show how this approach is able to identify potential interactions even in highly studied species and highly studied genes, with a focus on the D. melanogaster sex-determiner, Sxl, using data from 72 species of Dipterans.

4
Evolution of regulatory networks controlling plasticity in gene expression between Saccharomyces cerevisiae and Saccharomyces paradoxus

Redhuis, A. C.; Wittkopp, P. J.

2026-05-20 evolutionary biology 10.64898/2026.05.18.725926 medRxiv
Top 0.1%
32.2%

Organisms cope with environmental changes by modifying gene expression. To understand how regulatory networks controlling expression plasticity evolve, we analyzed RNAseq data from Saccharomyces cerevisiae, Saccharomyces paradoxus, and their F1 hybrids at multiple timepoints after transferring cells from standard laboratory conditions to five environments (low phosphorus, low nitrogen, hydroxyurea shock, heat stress, and cold stress) and during the diauxic shift. In each of the six datasets, we identified genes that changed expression following the transition to the new environment and used hierarchical clustering to identify genes that increased or decreased in expression. We then compared these classifications between orthologs to identify genes with divergent plasticity. For some genes, plasticity was more extreme in one species than the other, and for others, expression of orthologs changed in opposite directions when acclimating to the same environment. Most cases of plasticity divergence were seen only in one environment and were attributable primarily to trans-regulatory divergence. Using environment-specific regulatory networks inferred from data in Yeastract, we found that divergent plasticity of environment-specific transcription factors generally did not predict divergent plasticity of their target genes. We also found that, as a group, genes with conserved plasticity tended to have more regulatory interactions than genes with divergent plasticity. Interesting patterns of expression divergence were also observed for five transcription factors in the pleiotropic drug resistance network and their target genes that might contribute to phenotypic divergence. Together, these findings show how environment-specific trans-regulatory divergence and combinatorial gene regulation shape the evolution of expression plasticity.

5
The B-value calculator: expected diversity under background selection

Marsh, J. I.; Daigle, A. T.; Johri, P.

2026-03-06 evolutionary biology 10.64898/2026.03.04.709642 medRxiv
Top 0.1%
27.5%

Background selection (BGS), resulting from indirect effects of selection at linked and unlinked conserved sites experiencing purifying selection, is a key evolutionary force that shapes patterns of diversity across genomes. Calculating the expected diversity at neutral sites under BGS relative to that under strict neutrality (referred to as B-value or simply B) is important for understanding processes that shape patterns of genomic variation and for developing null models when performing inference, in particular, demographic inference and detection of selective sweeps. We extend and integrate previous theory to estimate B-values analytically, assuming no selective interference. Here, we present the B-value calculator, Bvalcalc, an easy-to-use command-line interface written in Python for efficient analytical calculation of expected B genome-wide at single base-pair resolution. Bvalcalc has several modules for calculating diversity as a function of distance from a single conserved element or considering the multiplicative effects of all conserved elements across the genome, accounting for recombination maps, gene conversion, self-fertilization, single population size changes, and unlinked effects from other chromosomes, where specified. We validated the effectiveness of Bvalcalc with comparisons against simulated results, and generated B-maps for the model species Homo sapiens, Drosophila melanogaster and Arabidopsis thaliana as a proof of concept using public data. Bvalcalc is available with documentation at johrilab.github.io/Bvalcalc/.

6
Evolution of ion channels in the water-to-land transition of vertebrates

Uribe, C.; Riadi, G.; Opazo, J. C.

2026-04-23 evolutionary biology 10.64898/2026.04.08.717291 medRxiv
Top 0.1%
27.3%

The transition of vertebrates from aquatic to terrestrial environments represents one of the most profound evolutionary events in their history, involving extensive physiological and morphological innovations. Key adaptations included the transformation of fins into limbs with digits to enable efficient terrestrial locomotion, the ability to perceive novel environmental stimuli, and the emergence of reproductive strategies suited to life on land, processes in which ion channels played fundamental roles. Accordingly, understanding the genetic basis of vertebrate terrestrialization requires investigating the evolution of this group of membrane proteins. Our analyses reveal that the proportion of ion channel genes is highly conserved, representing approximately 1.4% to 1.6% of total protein-coding genes in most lineages, with a notable increase to ~1.9% in teleost fishes. Our natural selection analyses revealed an overrepresentation of specific ion channel gene families, including TRP, RyR, HTR3, and HCN. We identified 29 ion channel genes showing signatures of positive selection, many of which are associated with key physiological functions such as nociception and thermosensation. We also detected an elevated rate of gene turnover in the common ancestor of terrestrial vertebrates, indicative of substantial genomic remodeling through gene gain and loss. Together, these findings suggest that, despite overall conservation in the proportions of ion channel genes, specific gene families underwent changes that were likely critical to meeting the physiological demands of terrestrial life. These results provide a foundation for future comparative and functional studies aimed at elucidating the molecular mechanisms underlying major environmental transitions.

7
The dynamics of silent variation in Mimulus guttatus: Codon usage bias and linked selection

Madrigal Roca, L. J.; Kelly, J. K.

2026-05-20 evolutionary biology 10.64898/2026.05.18.725996 medRxiv
Top 0.1%
24.9%

Synonymous nucleotide variation, which is remarkably high in Mimulus guttatus, can be affected by both codon usage selection (translational efficiency) and linked selection (hitchhiking effects). Codon usage reflects a genome-wide tug-of-war between mutational pressure toward A/T-ending codons and weak selection favoring G/C-ending codons. The outcome is determined largely by gene expression level and localized variation in recombination rate. Using both mechanistic (ROC-SEMPPR) and population genetic models, we find that most genes are weakly selected for codon usage, about 76% yielding scaled selection coefficients (S = 4N_e s) in the range of 0 to 1. Additionally, 4029 genes, primarily involved in photosynthesis, translation, defense, and phosphate scavenging, experience strong selection (S > 1). Levels of nucleotide variation within genes indicate a strong effect of linked selection. Non-synonymous polymorphism declines in genes with strong purifying selection, and as the rate of (intra-genic) recombination declines. Levels of synonymous polymorphism usually track non-synonymous (owing to background selection), except in genes under the strongest translational selection. Counterintuitively, we find that codon usage selection has a generally positive effect on synonymous nucleotide diversity at 4-fold degenerate positions. Since mutation strongly disfavors the optimal base in M. guttatus, codon selection in the range of 0 < S < 2 evens the balance (between selection and mutation) and thus inflates heterozygosity.

8
Benchmarking BEAGLE to find optimal parameters for BEAST X

Fosse, S.; Duchene, S.; Duitama Gonzalez, C.

2026-03-12 bioinformatics 10.64898/2026.03.10.710534 medRxiv
Top 0.1%
22.8%

Bayesian phylogenetic analyses are notoriously time-consuming, largely because exploring the posterior distribution requires computing Felsenstein's likelihood. The BEAGLE library is a high-performance computational tool that dramatically accelerates the calculation of such likelihoods by leveraging parallel processing on GPUs, multicore CPUs, and SSE vectorisation. Here we present results from benchmarking a widely popular phylogenetics package, BEAST X, using BEAGLE integration, focusing on how hardware allocation affects running times. We demonstrate substantial differences among BEAGLE settings on real Dengue Virus (DENV) data, both with and without partitioning. Using simulated sequences, we establish guidelines for GPU usage in BEAST X runs. These guidelines can be used for effective resource allocation for empirical analyses and simulation studies.

9
Multiple molecular and cellular properties jointly affect protein and site-specific evolutionary rates

Saini, A.; Usmanova, D. R.; Supo Escalante, R.; Vitkup, D.

2026-05-23 evolutionary biology 10.64898/2026.05.20.726710 medRxiv
Top 0.1%
22.7%

Protein evolutionary rates vary widely across proteins and among sites within proteins, reflecting multiple molecular, cellular, and functional constraints. While protein-level properties, such as expression and essentiality, and site-level structural and functional constraints, are known to influence evolutionary rates, how these constraints combine across scales to determine site-specific evolutionary rates remains unclear. Moreover, because many protein features are strongly correlated, it is difficult to disentangle their individual contributions to evolutionary rate variance, and unified predictive models that integrate these properties are still lacking. Here, we use neural networks to predict protein evolutionary rates across multiple scales based on multiple molecular and cellular features. At the protein level, integrating molecular and cellular descriptors explains substantial variance in evolutionary rates across proteins in multiple eukaryotic species, including nearly 50% of the variance in humans and substantial fractions of the variance in other eukaryotic species. The model also allows us to identify proteins whose evolutionary rates deviate from expectations based on their molecular and cellular properties. At the site level, we found that structural and functional features explain a comparable fraction of the variance in relative evolutionary rates. By integrating protein-level and site-level predictors, the model explains up to 37% of the variance in site-specific evolutionary rates across proteins. Our analysis demonstrates that constraints at these two scales combine largely additively, with protein-level properties setting the overall evolutionary context and site-level properties shaping variation within proteins. Together, these results provide a quantitative framework for understanding protein evolution across biological scales.

10
A simple test demonstrates that many prokaryotic accessory genes are adaptive.

Eyre-Walker, Y. C.; Conradsen, C.; Vos, M.; Eyre-Walker, A.

2026-03-18 evolutionary biology 10.64898/2026.03.16.712127 medRxiv
Top 0.1%
22.1%

Bacterial genomes often contain many genes that are only present in a subset of strains, the so-called accessory genes. Whether these genes are adaptive, neutral or deleterious remains contentious. Here we introduce a simple test to differentiate between these possibilities. If an accessory gene is adaptive then the sequence of the gene should be conserved, and the ratio of non-synonymous to synonymous diversity, πn/πs, should be less than one. In contrast, if the gene is neutral or deleterious, selection should not conserve the gene sequence, and πn/πs should equal one. We apply this test to accessory genes in Escherichia coli and Staphylococcus aureus, two highly divergent bacterial species with a large and a small pangenome respectively. We find πn/πs < 1 for genes at all frequencies in both species, demonstrating that many are adaptive. We estimate that at least 75% of all the accessory genes are maintained by selection in the two samples of 500 genomes that we have analysed, equating to thousands of adaptive accessory genes in both species, a substantial increase on previous estimates.
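
The decision rule described in this abstract can be sketched in a few lines of Python. This is only an illustration of the ratio comparison, not the authors' implementation: the function names and the diversity values are made up, and a real analysis would need confidence intervals or a formal test rather than a bare point estimate.

```python
def pi_ratio(pi_n: float, pi_s: float) -> float:
    """Return non-synonymous over synonymous diversity (hypothetical helper)."""
    if pi_s <= 0:
        raise ValueError("pi_s must be positive to form the ratio")
    return pi_n / pi_s


def looks_conserved(pi_n: float, pi_s: float) -> bool:
    """True when the point estimate of the ratio is below one.

    Under the abstract's reasoning, a ratio below one suggests that
    selection is conserving the gene sequence; a ratio near one is
    what a neutral or deleterious gene would be expected to show.
    """
    return pi_ratio(pi_n, pi_s) < 1.0


# Made-up diversity values for a single accessory gene.
print(looks_conserved(0.002, 0.010))
```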

11
Genomic dialects: How amino acid properties and the second codon base shape the informational accents of life

Martinez, O.; Ochoa-Alejo, N.

2026-04-24 bioinformatics 10.64898/2026.04.21.720023 medRxiv
Top 0.1%
22.1%

Codon Usage Bias (CUB) is a fundamental feature of genomic architecture, reflecting a balance between mutational pressure and natural selection. We propose a "genomic dialects" framework, where species-specific CUB profiles represent "informational accents" constrained by biochemical and structural requirements. Utilizing a normalized informational index based on Shannon's entropy, we analyzed CUB profiles for 18 amino acids across 1,406 species from the three domains of life. Linear models were employed to investigate the relationship between CUB and physicochemical properties, including Saier's second-codon-base classification, molecular volume, hydrophobicity, aliphatic/aromatic status, and dissociation constants. CUB distributions are highly skewed, with > 52% of values below 0.1, suggesting a near-optimal use of the genetic code's potential. We demonstrate that amino acid properties significantly influence CUB, with Saier's classification explaining up to 69% of variance in Archaea and ≈ 47% across all taxa. Hydrophobic amino acids (Q1 class) consistently exhibit higher average CUB than hydrophilic ones, particularly in microbes. Individual species models reveal extreme correlations; for example, in the alga Chlamydomonas reinhardtii, Saier classes explain > 95% of CUB variance. Finally, we show that CUB-based dendrograms represent phenetic similarity ("genomic accents") rather than reliable phylogenetic reconstructions, as they rarely coincide with the true Tree of Life. Our findings indicate that the "rules" of genomic dialects are largely anchored in the dual requirements of translational fidelity and protein stability. The observed "informational accents" are proximately governed by the metabolic and genomic machinery under the constraints of the drift-barrier hypothesis. This study provides a robust framework for understanding how the physical realities of amino acids have shaped the evolution of the genetic code's informational use across the tree of life.

12
Linking genotype to longevity under genealogical discordance in Sebastes rockfishes

Mo, Y. K.; Sudmant, P. H.; Hahn, M. W.

2026-03-30 evolutionary biology 10.64898/2026.03.26.714443 medRxiv
Top 0.1%
22.0%

Rockfishes (genus Sebastes) show extreme variation in longevity among closely related species, but the evolutionary history of this young radiation is highly complex. To unpack these relationships and to associate genotypes with phenotypes, we quantified genealogical discordance among 55 Sebastes species and implemented a phyloGWAS framework that incorporates discordant gene histories into genotype-longevity association tests. We found that genealogical discordance is extremely high: the inferred species tree topology differed among several ILS-aware methods, with most internal branches having low concordance factors regardless of which method was used. Nevertheless, some phylogenetic structure was shared by all inferred species trees. We used simulations to assess the statistical properties of phyloGWAS applied to complex traits using different genetic relatedness matrices (GRMs) and under varying levels of discordance. Adding an accurate GRM reduced false positives relative to a model without relatedness, but GRMs only modestly increased power to detect true positives. Using multiple approaches on the Sebastes data, phyloGWAS identified several variants associated with longevity. Our results indicate that extreme genealogical discordance is a core feature of Sebastes evolution and that phyloGWAS can help in connecting genotype to phenotype under these conditions.

13
Interpreting GC content differences across populations at polymorphic sites

Chandra, S.; Gao, Z.

2026-05-18 evolutionary biology 10.64898/2026.05.16.725686 medRxiv
Top 0.1%
21.9%

Recent studies have reported consistent inter-population differences in GC content at polymorphic sites in multiple species, including humans. Specifically, populations that experienced recent bottlenecks exhibit lower average GC content (GC%) at common polymorphic sites compared to non-bottlenecked groups, an observation previously interpreted as an indication of rapid evolution of base composition. In this study, we investigate the evolutionary and technical factors driving these patterns across humans, mice, maize, and silkworm. We find that GC% at polymorphic sites is highly sensitive to the allele frequency threshold applied. Relaxing this threshold reduces inter-population differences to negligible levels in humans and significantly attenuates similar signals in other species. We further observe substantial GC% variation across allele frequency bins, a pattern driven by the differential abundance of different mutation types. We demonstrate that these observations are collectively driven by an interaction between demographic history and a universal excess of strong-to-weak mutations relative to weak-to-strong mutations, which is counteracted by GC-biased gene conversion (gBGC) over long evolutionary timescales. Forward-in-time simulations with realistic parameters recapitulate observed patterns of GC% variation across both populations and allele frequency bins. Overall, our findings reveal that the base composition at polymorphic sites is strongly shaped by the interaction between demographic history, mutation bias, and gBGC, and does not represent stable, genome-wide trends. Consequently, inter-population differences in GC content, especially at common variants, should not be interpreted as evidence of ongoing divergence in base composition or shifts in mutation patterns.

14
Human ancestors interbred with two distinct populations of distant relatives

Rogers, A. R.; Islam, M. T.; Brand, C. M.; Webster, T. H.

2026-03-23 evolutionary biology 10.64898/2026.03.22.713509 medRxiv
Top 0.1%
21.9%

Ancient DNA has shown that a distantly-related "superarchaic" population interbred first with the ancestors of Neanderthals and Denisovans and later with Denisovans themselves. Other work has shown that a superarchaic population interbred with the African ancestors of all modern humans. But it is not yet clear whether these events involved the same superarchaic population. Here, we use the distribution of derived alleles among populations to evaluate hypotheses about superarchaics and their relationship to other hominins of the Pleistocene and Holocene. We find evidence for at least two distinct superarchaic populations. The one contributing to archaic Eurasian populations (Denisovans and Neanderthal-Denisovan ancestors) diverged earlier from the human lineage than did the one contributing to early moderns in Africa. These findings reveal previously unrecognized structure among hominin populations of the Pleistocene.

15
Convergent gene erosion in the chemical defensome of marine mammals

Danneels, B.; Oliveira, D. O.; Castro, F. L. C.; Karlsen, O. A.; Ruivo, R.; Goksoyr, A.

2026-05-23 genomics 10.64898/2026.05.21.726804 medRxiv
Top 0.1%
21.9%

To preserve homeostasis in the face of continual chemical insult, animals evolved dedicated molecular systems that detect, detoxify, and eliminate foreign compounds. Collectively, these enzymes, transporters, and regulatory pathways constitute the chemical defensome. In cetaceans, the loss of two key nuclear receptors (NR1I2/PXR and NR1I3/CAR) suggests a profound rearrangement of the chemical defense systems. Therefore, we investigated the gene inventory of the chemical defensome in Cetacea and two other major marine mammal lineages (Pinnipedia and Sirenia), using their closest terrestrial relatives to understand the extent and patterns of chemical defensome remodelling. We demonstrate large-scale gene loss in chemical defensome genes of cetaceans, as well as smaller scale gene loss in the other two marine mammal lineages, indicating possible convergent evolution. Gene loss occurred predominantly in phase I and phase II biotransformation enzymes, including CYPs, FMOs, SULTs, and GSTs. Many of the lost genes in cetaceans are known to be regulated by PXR and/or CAR, while genes lost in multiple marine mammal lineages are often not regulated by these transcription factors. We hypothesize that the transition to aquatic environments, often accompanied by corresponding changes in feeding habits, led to convergent loss of chemical defensome genes, and loss of PXR and CAR in cetaceans accelerated these losses. These findings reveal systematic erosion of chemical defense capabilities across marine mammal lineages, suggesting that adaptation to marine life involves trade-offs in detoxification capacity that may have significant implications for these species' responses to increasing chemical pollution in present-day ocean environments.

16
MKado: a toolkit for McDonald-Kreitman tests of natural selection

Rivera-Colon, A. G.; Rehmann, C. T.; Kern, A. D.

2026-03-04 evolutionary biology 10.64898/2026.03.02.709122 medRxiv
Top 0.1%
21.8%

Summary: MKado is a Python toolkit for performing McDonald-Kreitman (MK) tests of natural selection from aligned coding sequences. It implements the standard MK test as well as a wide variety of its extensions and related statistics, including a number of methods for estimating the fraction of adaptive substitutions (α) while accounting for slightly deleterious mutations, with a unified command-line interface and Python API. MKado supports parallel batch processing of thousands of genes with near-linear scaling, and provides publication-ready visualizations including volcano plots and asymptotic curves. Availability and Implementation: MKado is freely available at https://github.com/kr-colab/mkado under the MIT license. Full documentation is available at https://mkado.readthedocs.io. MKado is implemented in Python and installable via pip. Contact: adkern@uoregon.edu
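
For readers unfamiliar with the quantities involved, the sketch below computes the classic point estimate of the fraction of adaptive substitutions, α = 1 − (Ds·Pn)/(Dn·Ps), and the neutrality index from a 2×2 MK table. It does not use MKado and makes no claim about MKado's API; the class and function names are hypothetical, the counts are illustrative, and the simple estimator shown here does not correct for slightly deleterious mutations, which the abstract says the toolkit's methods address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MKCounts:
    """Counts for a 2x2 MK table (hypothetical container, not MKado's)."""
    dn: int  # non-synonymous fixed differences
    ds: int  # synonymous fixed differences
    pn: int  # non-synonymous polymorphisms
    ps: int  # synonymous polymorphisms


def neutrality_index(c: MKCounts) -> float:
    """NI = (Pn / Ps) / (Dn / Ds), computed as (Pn * Ds) / (Ps * Dn)."""
    if c.ps == 0 or c.dn == 0:
        raise ValueError("Ps and Dn must be non-zero")
    return (c.pn * c.ds) / (c.ps * c.dn)


def alpha(c: MKCounts) -> float:
    """Simple point estimate of the fraction of adaptive substitutions."""
    return 1.0 - neutrality_index(c)


# Illustrative counts only.
counts = MKCounts(dn=7, ds=17, pn=2, ps=42)
print(f"NI = {neutrality_index(counts):.3f}, alpha = {alpha(counts):.3f}")
```

Computing the neutrality index first and deriving α from it keeps the two statistics consistent by construction.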

17
Sub-cellular Systems Drift Drives Mosaic Evolution of Mammalian Neurons.

Rosario, J. G.; Kim, J.

2026-03-31 evolutionary biology 10.64898/2026.03.27.714927 medRxiv
Top 0.1%
21.5%

Evolution of the mammalian brain has been described as mosaic evolution wherein natural selection for behavioral function promotes independent evolution of specific functional units despite developmental constraints that might govern overall change [1,2]. Evidence of mosaic evolution has been reported at the level of gene expression in individual structures [3], cell type abundances [4], as well as gene regulatory changes at the single cell level [5-7]. In particular, it has been hypothesized that brain evolution involves changes in circuit organization [6,8]. Circuit-level changes involve sub-cellular compartments that mediate synaptic activity, raising the question whether mosaic brain evolution might be found at the sub-cellular scale. Here, we examine the rate of evolutionary divergence between Mus musculus (C57BL/6) and Rattus norvegicus (Sprague-Dawley) for their dendritic transcriptome, which shapes the post-synaptic proteome through sub-cellular localization and local translation [9]. We address the problem of variable assessment of the dendritic transcriptome by micro-dissecting individual hippocampal pyramidal neurons to create matched single cell libraries of the soma and the dendrites from the same cell and apply a machine learning model to predict localization. Our results show that the dendritic transcriptome is significantly more divergent than the soma, but the core functional roles of the dendritically localized genes are conserved. Examining gene families for their localization suggests enrichment of family level conservation or localization. We propose that the observed divergence may arise from a combination of adaptive modulation and system drift under selection for core function. Our study suggests fine-grained mosaic evolutionary dynamics at the scale of synaptic function that mediates information processing and neural connectivity.

18
bifrost: an R package for scalable inference of phylogenetic shifts in multivariate evolutionary dynamics

Berv, J. S.; Fox, N.; Thorstensen, M. J.; Lloyd-Laney, H.; Troyer, E. M.; Rivero-Vega, R. A.; Smith, S. A.; Friedman, M.; Fouhey, D. F.; Weeks, B. C.

2026-04-14 evolutionary biology 10.64898/2026.04.12.718036 medRxiv
Top 0.1%
18.4%

High-dimensional comparative datasets, including geometric morphometric landmarks, functional traits, and other large trait datasets, are increasingly common in biology. When these datasets include a large number of traits relative to the number of taxa, they pose significant challenges for phylogenetic comparative analysis. In addition, evolutionary dynamics are often heterogeneous across phylogenies, challenging researchers to develop tools that can localize and account for such variation when investigating hypotheses of evolutionary change. We present bifrost, an R package for detecting and characterizing shifts in multivariate trait evolution across phylogenetic trees. bifrost implements a stepwise greedy search over alternative macroevolutionary regime configurations on a phylogeny. Candidate shifts are proposed and assessed at internal nodes, accelerated with parallel model fitting where possible, and aggregated sequentially when they exceed a user-defined information-criterion acceptance threshold. The underlying model is a scalar-rate multivariate Brownian motion process fit by generalized least squares using mvMORPH::mvgls [1]. Our framework also provides support estimates for individual shifts using information-criterion weights. We illustrate the workflow using a fossil-tip-dated phylogeny and high-dimensional landmark data for early bony fish jaws (32,508 scalar coordinate values), and discuss tuning, outputs, and limitations. bifrost extends existing phylogenetic comparative frameworks for evolutionary analysis and provides a scalable pipeline for exploring the phylogenetic natural history of large multivariate datasets.

19
When South meets North: a joint contact zone coinciding with environmental gradients in three boreal tree species

Herrera Egoavil, P.; Leal, J. L.; Zhou, Q.; Milesi, P.; Lascoux, M.; Yildirim, B.

2026-03-17 evolutionary biology 10.64898/2026.03.13.711305 medRxiv
Top 0.1%
18.0%

Post-glacial recolonization of Fennoscandia created secondary contact zones in many species, offering opportunities to study how gene flow and selection contribute to their establishment and maintenance. Here, we analyse genomic data from three boreal tree species (Picea abies, Betula pendula, and Pinus sylvestris) sampled along a latitudinal gradient in Sweden. Despite differences in colonization timing and dispersal ecology, all three species exhibit north-south genetic structuring aligned with environmental gradients. Most notably, the two main genetic clusters within each species overlap in a shared contact zone, corresponding to the climatic transition between Sweden's two major environmental zones. The extent and structure of the contact zone differ among species: P. abies shows stronger genetic structure and moderate gene flow, B. pendula exhibits intermediate differentiation and gene flow, and P. sylvestris displays the weakest structure with stronger gene flow. All three species also show genomic signatures of local adaptation, with distinct underlying architectures. In P. abies, adaptive loci are broadly distributed across the genome, while, strikingly, they are mostly found within an inversion on chromosome 1 in B. pendula. In P. sylvestris, local adaptation likely relies on subtle allele frequency shifts across many loci with weak signals. These patterns align with theoretical expectations for polygenic local adaptation under varying migration regimes. Our comparative approach demonstrates how gene flow and selection jointly shape genomic landscapes in shared environments and contributes to understanding local adaptation in forest trees, with implications for predicting species' responses to climate change.

20
EMPIRE: The Ellipse Model for Phylogenetic Inference of Range Evolution

Swiston, S. K.; McHugh, S. W.; Landis, M. J.

2026-04-24 evolutionary biology 10.64898/2026.04.23.720387 medRxiv
Top 0.1%
18.0%

Many phylogenetic models of historical biogeography exist for describing how lineages move and evolve over time. Here, we present the Ellipse Model for Phylogenetic Inference of Range Evolution (EMPIRE), which models the movement and splitting of species range ellipses in continuous space, summarizing important attributes of each range, such as its position, size, and orientation. The framework allows us to reconstruct ancestral range ellipses, investigate rates governing important processes like movement, expansion, and elongation, and examine the spatial context of speciation, including asymmetric range inheritance at cladogenesis. We apply EMPIRE to the Australian Sphenomorphinae, a group of skinks whose diversification has coincided with substantial climatic change over the past ~36 million years. We find that speciation events are positively associated with aridification, while daughter lineages post-speciation do not tend to show evidence of ecological partitioning.